Given a natural language description of a user's requirements, the NL2Code task aims to generate code that fulfills those requirements. This is a critical but challenging task that mirrors the capabilities of AI-powered programming. The NL2Code task is inherently versatile, diverse, and complex: for example, a requirement can be described in different languages, in different formats, and at different levels of granularity. This motivated us to conduct this survey on NL2Code. In this survey, we focus on how neural networks (NNs) solve NL2Code. We first propose a comprehensive framework that covers all studies in this field. Then, we analyze the existing studies in depth within this framework. We have created an online website to record the results of this analysis, which tracks existing and recent NL2Code progress. In addition, we summarize the current challenges of NL2Code as well as its future directions. We hope that this survey can foster the evolution of this field.
Skeleton-based action recognition methods are limited by their semantic extraction from spatio-temporal skeletal graphs. Current methods struggle to effectively combine features across the temporal and spatial graph dimensions, often emphasizing one side while neglecting the other. In this paper, we propose a Temporal-Channel Aggregation Graph Convolutional Network (TCA-GCN) to dynamically and efficiently learn spatial and temporal topologies across different temporal and channel dimensions for skeleton-based action recognition. We use a Temporal Aggregation module to learn temporal-dimension features and a Channel Aggregation module to efficiently combine spatially dynamic channel-wise topological features with temporally dynamic topological features. In addition, we extract multi-scale skeletal features in temporal modeling and fuse them with an attention mechanism. Extensive experiments show that our model outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
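To make the channel-aggregation idea concrete, here is a minimal, hedged sketch of aggregating skeleton-joint features with a channel-specific learned adjacency added to a shared adjacency. It is a generic GCN-style illustration under assumed tensor shapes, not the exact TCA-GCN modules.

```python
# Generic channel-wise topology aggregation for skeleton graphs (illustrative only;
# shapes, the learnable refinement, and the projection are assumptions).
import torch
import torch.nn as nn

class ChannelAggregation(nn.Module):
    def __init__(self, channels: int, num_joints: int):
        super().__init__()
        # Shared skeleton adjacency (identity here as a stand-in) plus a learnable
        # per-channel refinement that lets each channel use its own topology.
        self.shared_adj = nn.Parameter(torch.eye(num_joints), requires_grad=False)
        self.channel_adj = nn.Parameter(torch.zeros(channels, num_joints, num_joints))
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, joints)
        adj = self.shared_adj + self.channel_adj          # (C, V, V)
        out = torch.einsum("nctv,cvw->nctw", x, adj)      # aggregate over joints per channel
        return self.proj(out)

feat = ChannelAggregation(channels=64, num_joints=25)(torch.randn(8, 64, 32, 25))
```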
Code generation is a longstanding challenge that aims to generate a code snippet from a natural language description. Typically, expensive text-code paired data is essential for training a code generation model. Recently, thanks to the success of pre-training techniques, large language models have been trained on large-scale unlabeled code corpora and perform well at code generation. In this paper, we investigate how to leverage an unlabeled code corpus to train a model for library-oriented code generation. Although reusing third-party libraries is a common practice for programmers, text-code paired data is hard to obtain because of the huge number of libraries. We observe that library-oriented code snippets are more likely to share similar code sketches. Hence, we present CERT, which consists of two steps: a sketcher generates a sketch, and a generator then fills in the details of the sketch. Both the sketcher and the generator are continually pre-trained upon a base model using unlabeled data. Furthermore, we craft two benchmarks named PandasEval and NumpyEval to evaluate library-oriented code generation. Experimental results demonstrate the impressive performance of CERT; for example, it surpasses the base model by an absolute 15.67% improvement in terms of pass@1 on PandasEval. Our work is available at https://github.com/microsoft/pycodegpt.
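As a rough illustration of the two-step pipeline, the sketch below chains a "sketcher" model and a "generator" model. The model names, the `generate` helper, and the placeholder convention are assumptions for illustration, not CERT's actual implementation.

```python
# Minimal sketch of a two-stage sketcher/generator pipeline (illustrative only;
# model checkpoints and helper names are hypothetical).
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate(model_name: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Greedy decoding with a causal LM; returns only the newly generated text."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

def library_oriented_codegen(nl_description: str) -> str:
    # Step 1: the sketcher produces an anonymized sketch, e.g.
    # "df = pd.read_csv(PLACEHOLDER); df.groupby(PLACEHOLDER).mean()".
    sketch = generate("sketcher-model", nl_description)
    # Step 2: the generator fills in the concrete details, conditioned on
    # both the original description and the sketch.
    return generate("generator-model", nl_description + "\n" + sketch)

print(library_oriented_codegen("Compute the mean price per category in a CSV."))
```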
With the ubiquitous deployment of smart devices and the Internet of Things, data sources for machine learning inference have increasingly shifted to the edge of the network. Existing machine learning inference platforms typically assume a homogeneous infrastructure and do not account for more complex, tiered computing infrastructures that include edge devices, local hubs, edge data centers, and cloud data centers. On the other hand, recent AutoML work offers viable solutions for model compression, pruning, and quantization in heterogeneous environments; for a given machine learning model, we can now easily find, or even generate, a series of models with different trade-offs between accuracy and efficiency. We design and implement JellyBean, a system for serving and optimizing machine learning inference workflows. Given service-level objectives (e.g., throughput, accuracy), JellyBean selects the most cost-effective models that meet the accuracy target and decides how to deploy them across the different tiers of the infrastructure. Evaluations show that, compared with state-of-the-art model selection and worker assignment solutions, JellyBean reduces the total serving cost of visual question answering by up to 58% and of vehicle tracking from the NVIDIA AI City Challenge by up to 36%. JellyBean also outperforms prior ML serving systems (e.g., Spark on the cloud) by up to 5x in serving cost.
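The core selection step can be pictured as a small optimization: among the available model variants, choose the cheapest one that still meets the accuracy target. The sketch below is a hedged illustration of that idea; the variant names, accuracy numbers, and cost figures are made-up assumptions, not JellyBean's actual planner or API.

```python
# Hypothetical SLO-driven model selection: pick the cheapest variant that meets
# an accuracy target (illustrative values, not JellyBean's algorithm).
from dataclasses import dataclass

@dataclass
class ModelVariant:
    name: str
    accuracy: float        # expected accuracy on the target task
    cost_per_query: float  # estimated serving cost per 1k queries

def pick_model(variants: list[ModelVariant], accuracy_target: float) -> ModelVariant:
    feasible = [v for v in variants if v.accuracy >= accuracy_target]
    if not feasible:
        raise ValueError("No model variant meets the accuracy target")
    return min(feasible, key=lambda v: v.cost_per_query)

variants = [
    ModelVariant("vqa-large", accuracy=0.72, cost_per_query=4.0),
    ModelVariant("vqa-medium", accuracy=0.68, cost_per_query=1.5),
    ModelVariant("vqa-small", accuracy=0.61, cost_per_query=0.4),
]
print(pick_model(variants, accuracy_target=0.65).name)  # -> vqa-medium
```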
Since Federated Learning (FL) was proposed, it has been applied in many fields, such as credit assessment and healthcare. Due to differences in network or computing resources, clients may not update their gradients at the same time, which can cause long waiting or idle periods. This is why Asynchronous Federated Learning (AFL) methods are needed. The main bottleneck in AFL is communication, and finding a balance between model performance and communication cost is a key challenge in AFL. This paper proposes VAFL, a new AFL framework. We verify the performance of the algorithm through extensive experiments, which show that VAFL reduces communication time by about 51.02% with an average communication compression rate of 48.23%, and allows the model to converge faster. The code is available at https://github.com/robai-lab/vafl
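For intuition, an asynchronous server applies each client's update as it arrives instead of waiting at a synchronization barrier. The sketch below shows this pattern with generic staleness down-weighting and top-k gradient sparsification to reduce communication; these particular choices are illustrative assumptions, not the specific VAFL algorithm.

```python
# Hedged sketch of an asynchronous federated-learning server loop (generic
# staleness weighting and top-k compression; not VAFL's exact method).
import numpy as np

def compress_topk(grad: np.ndarray, ratio: float = 0.5) -> np.ndarray:
    """Keep only the largest-magnitude entries to reduce communication volume."""
    k = max(1, int(grad.size * ratio))
    out = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    out[idx] = grad[idx]
    return out

class AsyncFLServer:
    def __init__(self, dim: int, lr: float = 0.1):
        self.weights = np.zeros(dim)
        self.lr = lr
        self.version = 0  # incremented on every applied update

    def apply_update(self, client_grad: np.ndarray, client_version: int) -> None:
        staleness = self.version - client_version  # how outdated the client's model was
        scale = 1.0 / (1.0 + staleness)            # down-weight stale contributions
        self.weights -= self.lr * scale * compress_topk(client_grad)
        self.version += 1
```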
Growing concerns about data privacy and security have driven the emerging field of privacy-preserving machine learning from isolated data sources, i.e., federated learning. One class of federated learning, vertical federated learning, in which different parties hold different features of common users, has the potential to enable a wide variety of business collaborations among enterprises in many fields. In machine learning, decision tree ensembles such as gradient boosting decision trees (GBDT) and random forests are widely applied, powerful models with high interpretability and modeling efficiency. However, state-of-the-art vertical federated learning frameworks adopt anonymous features to avoid possible data breaches, which compromises the interpretability of the model. To address this issue in the inference process, in this paper we first analyze the necessity of disclosing the meanings of features to the guest party in vertical federated learning. We then observe that the prediction result of a tree can be expressed as the intersection of the results of the tree's sub-models held by all parties. Exploiting this key observation, we protect data privacy and allow the disclosure of feature meanings by concealing decision paths, and we adopt a communication-efficient secure computation method for the inference outputs. The advantages of Fed-EINI are demonstrated through both theoretical analysis and extensive numerical results. We improve the interpretability of the model by disclosing the meanings of features, while ensuring efficiency and accuracy.
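The key observation can be illustrated in plain code: each party, using only the splits on features it owns, computes the set of leaves consistent with its side of the data, and the true prediction leaf is the intersection of those candidate sets. The data structures below are simplified assumptions for illustration, not the Fed-EINI protocol itself.

```python
# Illustrative sketch: tree prediction as the intersection of per-party candidate
# leaf sets (simplified; not the secure Fed-EINI protocol).

def candidate_leaves(party_splits: dict[int, bool],
                     leaf_paths: dict[str, dict[int, bool]]) -> set[str]:
    """Leaves whose path constraints are consistent with the splits this party can evaluate."""
    return {
        leaf for leaf, path in leaf_paths.items()
        if all(party_splits.get(node, outcome) == outcome for node, outcome in path.items())
    }

# Each leaf is described by the split outcomes on its root-to-leaf path.
leaf_paths = {
    "L1": {0: True, 1: True},    # split node 0 owned by party A, node 1 by party B
    "L2": {0: True, 1: False},
    "L3": {0: False},
}
party_a = candidate_leaves({0: True}, leaf_paths)    # {"L1", "L2"}
party_b = candidate_leaves({1: False}, leaf_paths)   # {"L2", "L3"}
print(party_a & party_b)                             # {"L2"}: the prediction leaf
```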
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications cannot benefit, or benefit only marginally, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, inputs, network regularization, sequential distillation, etc., revealing that: 1) Distilling token relations is more effective than CLS token- and feature-based distillation; 2) Using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student mismatches that of the teacher; 3) Weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over scratch MIM pre-training on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our base-size TinyMIM model achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our tiny-size TinyMIM model achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
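To illustrate "distilling token relations," one common formulation matches the student's token-to-token similarity matrix to the teacher's with a softened KL divergence. The sketch below is a generic version of that idea under assumed tensor shapes and temperature; it is not TinyMIM's exact loss definition.

```python
# Hedged sketch of a token-relation distillation loss (generic formulation,
# not TinyMIM's exact objective).
import torch
import torch.nn.functional as F

def token_relation_loss(student_tokens: torch.Tensor,
                        teacher_tokens: torch.Tensor,
                        temperature: float = 1.0) -> torch.Tensor:
    """student_tokens, teacher_tokens: (batch, num_tokens, dim) from chosen layers."""
    def relations(x: torch.Tensor) -> torch.Tensor:
        x = F.normalize(x, dim=-1)
        return torch.matmul(x, x.transpose(1, 2)) / temperature  # (B, N, N)

    s_log = F.log_softmax(relations(student_tokens), dim=-1)
    t_prob = F.softmax(relations(teacher_tokens), dim=-1)
    return F.kl_div(s_log, t_prob, reduction="batchmean")

# Example: distill from an intermediate teacher layer into the student's last layer.
loss = token_relation_loss(torch.randn(2, 196, 384), torch.randn(2, 196, 768))
```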
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
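As a rough picture of query-based multi-modal fusion, the sketch below concatenates image and point-cloud tokens into one memory sequence and decodes 3D boxes from learnable object queries. The dimensions, module choices, and box parameterization are illustrative assumptions, not the actual CMT architecture.

```python
# Hedged sketch of fusing image and point-cloud tokens with a query-based
# transformer decoder (illustrative only; not CMT's exact design).
import torch
import torch.nn as nn

class FusionDetector(nn.Module):
    def __init__(self, dim: int = 256, num_queries: int = 100):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.box_head = nn.Linear(dim, 7)  # x, y, z, w, l, h, yaw

    def forward(self, image_tokens: torch.Tensor, point_tokens: torch.Tensor) -> torch.Tensor:
        # Concatenate both modalities into a single memory sequence.
        memory = torch.cat([image_tokens, point_tokens], dim=1)  # (B, N_img + N_pts, dim)
        q = self.queries.weight.unsqueeze(0).expand(memory.size(0), -1, -1)
        return self.box_head(self.decoder(q, memory))            # (B, num_queries, 7)

boxes = FusionDetector()(torch.randn(2, 500, 256), torch.randn(2, 300, 256))
```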
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
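For intuition, the simplest form of trigger injection in the spirit of NAIVEATTACK stamps a small patch onto a fraction of the raw images and relabels them to the attacker's target class before distillation begins. The patch shape, poison rate, and target label below are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of naive trigger injection into raw data prior to distillation
# (illustrative parameters; not the exact NAIVEATTACK or DOORPING procedure).
import numpy as np

def poison_dataset(images: np.ndarray, labels: np.ndarray,
                   target_label: int = 0, poison_rate: float = 0.1,
                   patch_size: int = 3) -> tuple[np.ndarray, np.ndarray]:
    """images: (N, H, W, C) in [0, 1]; returns a poisoned copy of the dataset."""
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_rate)
    idx = np.random.choice(len(images), n_poison, replace=False)
    # White square trigger in the bottom-right corner, plus target relabeling.
    images[idx, -patch_size:, -patch_size:, :] = 1.0
    labels[idx] = target_label
    return images, labels
```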
Blind image quality assessment (BIQA) remains challenging due to the diversity of distortions and the variation of image content, which complicate the distortion patterns across different scales and aggravate the difficulty of the regression problem in BIQA. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that make the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and better handle the regression problem, in line with the easy-to-hard progression of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that the performance of PMT-IQA is superior to that of the compared approaches, and that both the MS and PMT modules improve the model's performance.
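One way to picture a progressive multi-task objective is a loss whose weight gradually shifts from an easier auxiliary task (e.g., coarse quality classification) to the harder task (continuous quality-score regression) as training proceeds. The scheduling function and task heads below are assumptions for illustration, not the exact PMT-IQA formulation.

```python
# Hedged sketch of an easy-to-hard progressive multi-task loss
# (illustrative schedule and tasks; not PMT-IQA's exact objective).
import torch
import torch.nn.functional as F

def progressive_loss(pred_score: torch.Tensor, pred_class: torch.Tensor,
                     target_score: torch.Tensor, target_class: torch.Tensor,
                     step: int, total_steps: int) -> torch.Tensor:
    alpha = min(1.0, step / total_steps)                       # 0 -> 1 over training
    easy = F.cross_entropy(pred_class, target_class)           # coarse quality bins
    hard = F.mse_loss(pred_score.squeeze(-1), target_score)    # continuous quality score
    return (1 - alpha) * easy + alpha * hard
```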